A Large DataBase of Hypernymy Relations Extracted from the Web

Authors

  • Julian Seitner
  • Christian Bizer
  • Kai Eckert
  • Stefano Faralli
  • Robert Meusel
  • Heiko Paulheim
  • Simone Paolo Ponzetto
Abstract

Hypernymy relations (those where a hyponym term shares an “is-a” relationship with its hypernym) play a key role in many Natural Language Processing (NLP) tasks, e.g. ontology learning, automatically building or extending knowledge bases, or word sense disambiguation and induction. Such relations may provide the basis for constructing more complex structures such as taxonomies, or serve as effective background knowledge for many word understanding applications. We present a publicly available database containing more than 400 million hypernymy relations that we extracted from the CommonCrawl web corpus. We describe the infrastructure we developed to iterate over the web corpus, extract the hypernymy relations, and store them effectively in a large database. This collection of relations represents a rich source of knowledge and may be useful for many researchers. We offer the tuple dataset for public download and an Application Programming Interface (API) to help other researchers programmatically query the ...

Similar resources

WebIsALOD: Providing Hypernymy Relations Extracted from the Web as Linked Open Data

Hypernymy relations are an important asset in many applications, and a central ingredient to Semantic Web ontologies. The IsA database is a large collection of such hypernymy relations extracted from the Common Crawl. In this paper, we introduce WebIsALOD, a Linked Open Data release of the IsA database, containing 400M hypernymy relations, each provided with rich provenance information. As the ...

A Web Application to Search a Large Repository of Taxonomic Relations from the Web

Taxonomic relations (also known as isa or hypernymy relations) represent one of the key building blocks of knowledge bases and foundational ontologies and provide a fundamental piece of information for many text understanding applications. Despite the availability of very large knowledge bases, however, some Natural Language Processing and Semantic Web applications (e.g., Ontology Learning) sti...

Associative and Semantic Features Extracted From Web-Harvested Corpora

We address the problem of automatic classification of associative and semantic relations between words, and particularly those that hold between nouns. Lexical relations such as synonymy, hypernymy/hyponymy, constitute the fundamental types of semantic relations. Associative relations are harder to define, since they include a long list of diverse relations, e.g., “Cause-Effect”, “Instrument-Ag...

Extraction of semantic relations from a Basque monolingual dictionary using Constraint Grammar

This paper deals with the exploitation of dictionaries for the semi-automatic construction of lexicons and lexical knowledge bases. The final goal of our research is to enrich the Basque Lexical Database with semantic information such as senses, definitions, semantic relations, etc., extracted from a Basque monolingual dictionary. The work here presented focuses on the extraction of the semanti...

Publication date: 2016